PATENT

Attorney Docket No. 00939A-035010 



HIGH SPEED VIDEO FRAME BUFFER 

BACKGROUND OF THE INVENTION 

This invention relates in general to video electronics, and in particular to a 
high speed video display memory using dynamic memory cells implemented on the same 
chip as the video display controller. 

A typical computer system includes a video card that carries the circuitry 
for processing video signals and for driving the display panel. Figure 1 shows a 
conventional video card 100 that includes a display memory chip 102 (sometimes 
referred to as a frame buffer) connected to a controller chip 104 via input/output (I/O) 
pins 106. Display memory 102 stores data that represent the color or intensity of light 
for every picture cell (pixel) on the video screen, and controller 104 processes the data 
and drives the display. A drawback of this type of system is limited bandwidth between 
the memory and the controller caused by the limited number of data input/output pins 
106 on the two chips. 

It is desirable to substantially increase the rate of data transfer between the 
video memory and the video processor. Using a memory system with multiple banks
improves the bandwidth somewhat. For example, dual-bank video memories have been
developed whereby two word lines, one from each bank, can be selected at the same time.
While some improvement is achieved by this design, still higher bandwidths are required. 

Integrating both the memory circuit and the controller on the same chip is 
a solution that promises a significant increase in the bandwidth. With the memory on the 
same chip as the processor, instead of e.g., 32 bits over 32 I/O pins, 128 or 256 bits can 
be accessed internally at very high speeds. 



SUMMARY OF THE INVENTION 

The present invention offers an improved video memory circuit that is 
integrated on the same chip as the video controller. The memory circuit is arranged in a 
plurality of memory cell arrays that are separated by clusters of sense amplifiers. Each 
cluster of sense amplifiers is shared by two adjacent dynamic memory arrays resulting in 
a compact design that minimizes circuit area. 

In a typical dynamic memory, such as a dynamic random access memory 
(DRAM), access to a given cell usually occurs in two steps. First a row is opened, then a
column within that row is selected. Access to a column in a previously opened row is
relatively fast while access to a column in any other row is slow. Instead of activating 
an array only when a word line from that array is selected and then turning the array off 
after the data has been accessed, the present invention maintains the maximum number of 
arrays activated at any given time. That is, once an array is selected, it is not turned off
until it receives a command from the processor selecting a new row in that array or an
array adjacent to it. Because in the memory circuit of the present invention adjacent
arrays share the same group of sense amplifiers, when the memory receives a new
command selecting a word line from array N, any previously selected word lines from
array N as well as arrays N-1 and N+1 are first turned off. The bit lines are then
equilibrated and array N is then reopened to the appropriate address. The processor 
keeps track of which arrays are active and which rows are selected and which ones are 
off. 
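
By way of illustration only, the bookkeeping described above can be sketched
in software. The sketch below is a minimal behavioral model, not the circuit of
the invention: the class name ArrayTracker, the method name activate, and the
use of Python are assumptions introduced here for clarity. It records which row
is open in each array and, on a command to array N, closes any open rows in
arrays N-1, N and N+1 before opening the new row.

```python
from typing import List, Optional


class ArrayTracker:
    """Hypothetical model of the processor's record of open rows per array."""

    def __init__(self, num_arrays: int) -> None:
        # None means the array is off (no word line selected).
        self.open_row: List[Optional[int]] = [None] * num_arrays

    def activate(self, n: int, row: int) -> List[int]:
        """Open `row` in array n and return the arrays that were turned off."""
        if self.open_row[n] == row:
            return []  # requested row is already open; nothing to precharge
        closed = []
        for k in (n - 1, n, n + 1):
            if 0 <= k < len(self.open_row) and self.open_row[k] is not None:
                self.open_row[k] = None  # turn off the previously selected word line
                closed.append(k)
        # The bit lines of array n would be equilibrated at this point.
        self.open_row[n] = row  # reopen array n at the new address
        return closed


tracker = ArrayTracker(8)
tracker.activate(0, 5)
tracker.activate(2, 7)
print(tracker.activate(1, 3))  # arrays 0 and 2 share sense amplifiers with array 1
```

In this model, non-adjacent arrays stay open indefinitely, so at most every
other array holds an open row at any time.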

This scheme allows half of the arrays to be selected at the same time. By 
specifically organizing the data such that a large number of adjacent pixels that are 
typically manipulated together are stored within those arrays that can be active 
simultaneously, the memory bandwidth is maximized. For example, the display screen 
can be divided into a bottom half and a top half. Pixel data corresponding to the bottom 
half can be stored in, for example, all odd numbered arrays and pixel data corresponding
to the top half can be stored in the even numbered arrays. Since most of the time all 
pixel data that are manipulated as a group would be stored in either even numbered or 
odd numbered arrays, all of those arrays can be accessed at one time, and as many word 
lines as half the number of arrays in the memory can be selected simultaneously. Thus,
access to read or write the memory is provided at a very high bandwidth. There is also
less power consumed as the word lines are not turned off and on for every access. 



Accordingly, in one embodiment, the present invention provides a method 
for operating a memory circuit having a plurality of arrays including the steps of (a) 
receiving a command accessing array N, (b) turning off arrays N, N+1 and N-1, (c)
equilibrating bit lines in array N, and (d) turning on array N to access a selected word 
line. 

In another embodiment, the present invention provides a method for 
operating a memory circuit having a plurality of arrays including the steps of (a) 
receiving a first command accessing a row in a first array, (b) turning on the first array 
to allow access to memory cells in that row, and (c) keeping the first array open until it 
receives a second command accessing a new row in the first array. The method further 
includes a step of turning off the first array upon receipt of the second command, and 
turning off a second array adjacent to the first array. 

A better understanding of the nature and advantages of the high speed 
video memory circuit of the present invention may be had with reference to the detailed 
description and the drawings below. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a simplified block diagram of a video card including a memory 
chip and a controller chip; 

Figure 2 is a conceptual block diagram of the multiple-array memory 
circuit according to the present invention; 

Figure 3 is an exemplary circuit schematic of an array enable logic; and 

Figure 4 illustrates exemplary divisions of pixels on a video display screen 
for data storage in the memory arrays to maximize memory bandwidth according to the
present invention. 
DESCRIPTION OF SPECIFIC EMBODIMENTS

Referring to Figure 2, there is shown a block diagram of a memory circuit
200 having multiple arrays A0 to An according to one embodiment of the present
invention. An array Ai may include, for example, 256 rows of 1024 memory cells each.
With these exemplary numbers, each array stores 256 Kbits of data. To reduce the size
of the memory circuit, according to a preferred embodiment of the present invention,
adjacent arrays share clusters of bit line sense amplifiers S/A1 to S/An. With the example
used herein, each cluster of sense amplifiers includes 512 individual sense amplifier
circuits that serve two arrays, one on either side.

Data is written into and read from the memory cells in each array via 
multiple global input/output GIO lines that selectively connect to the bit lines in arrays 
A0 to An via column select circuits (not shown). The width of this data bus corresponds
to the memory I/O bus that connects the memory to the controller on the same die.
There may be, for example, 128 parallel differential pairs of GIO lines GIO<0> to
GIO<127> that traverse the entire array. In such an exemplary case, there would be a
corresponding number (128) of write driver and I/O sense amplifier circuits (not shown)
that connect the memory I/O bus to the 128 pairs of GIO lines. Each array Ai further
connects to an output terminal of an array enable circuit AEi. An array enable circuit
AEi turns its associated array Ai on or off in response to control signals it receives from
the video controller (not shown). 

For illustrative purposes, Figure 2 depicts arrays A0 to An stacked in a
single column of arrays. The arrays may in fact be grouped into two or more stacks.
For example, the memory circuit may include 64 arrays of 1024x256 bits grouped as
four stacks of 16 arrays each. With one qualification, each array Ai operates almost as
an independent memory unit via the common memory I/O bus. Because neighboring 
arrays in the memory circuit of the present invention share bit line sense amplifiers, two 
adjacent arrays are not permitted to simultaneously have open rows. Thus, the memory 
circuit allows up to half of the arrays to have open rows at any given time. Using the 
above exemplary numbers, given 64 arrays each 1024 bits wide, there will be 32 Kbits
available for column access. According to this invention, once an array is activated on a
row, it remains active on that row until it is activated on a different row or until one of
its neighboring arrays is activated. Thus, repeated accesses can be made to the same
row of up to 32 already activated arrays without having to go through a precharge cycle.
The row addresses of the open rows need not be the same. This technique allows for 
maximizing the memory bandwidth by organizing and storing pixel data in the various 
arrays to take full advantage of the multiple simultaneously active arrays. 

The following exemplary numbers are used herein to describe the 
operation of the memory circuit in greater detail. It is assumed that it will take 20ns to
precharge (turn off the previously open row and equilibrate the bit lines), 30ns to select
and turn on a new row, making a column access possible, and 20ns from the time a column is
selected until the data is made available, for a maximum access time of 70ns.
Accordingly, referring to Figure 2, when a new row in an array Ai is selected, regardless
of whether that array or its two neighboring arrays Ai+1 and Ai-1 were on or off, the total
access time would be 70ns. This is slightly longer than a total access time of 50ns for
the prior art memories where the precharge time would not be included in the access
time. In the prior art circuit, however, a selected row is usually shut down after the
completion of the cycle. Thus, in the prior art circuit, if in a subsequent cycle access is
made to a different column in the same row, the total access time remains 50ns.

According to the present invention, however, once a specific row in an 
array is activated, that row remains open. Using the exemplary numbers, with the row 
already open, it takes only 20ns to access a new column in that row. The controller may 
open a row in a second non-neighboring array while keeping the row in the first array 
open. A new array can be activated every 10ns, provided it does not conflict with the 
activation in progress with a neighboring array. Continuing in that fashion, up to 32 of 
the 64 arrays may have simultaneously active rows. Thus, data can be accessed and 
transferred at a very fast rate as long as it resides in the various simultaneously active 
rows. 
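
The exemplary timing figures above lend themselves to a short worked
calculation. The following sketch is illustrative only and rests on stated
assumptions: the function names are hypothetical, and accesses are assumed to
be issued strictly one after another, so the overlapped activation of
non-neighboring arrays described above is not modeled.

```python
T_PRECHARGE_NS = 20  # turn off the open row and equilibrate the bit lines
T_ROW_NS = 30        # select and turn on a new row
T_COLUMN_NS = 20     # column select until the data is available

NUM_ARRAYS = 64
ROW_BITS = 1024
# Only every other array may hold an open row (shared sense amplifiers).
OPEN_BITS = (NUM_ARRAYS // 2) * ROW_BITS


def access_time_ns(row_already_open: bool) -> int:
    """Access time for one column access under the exemplary numbers."""
    if row_already_open:
        return T_COLUMN_NS
    return T_PRECHARGE_NS + T_ROW_NS + T_COLUMN_NS


def total_time_ns(num_accesses: int, num_open_row_hits: int) -> int:
    """Total time for a sequence of accesses, assuming no overlap between them."""
    misses = num_accesses - num_open_row_hits
    return (num_open_row_hits * access_time_ns(True)
            + misses * access_time_ns(False))


print(access_time_ns(False), access_time_ns(True), OPEN_BITS)
```

Under these assumptions, a sequence in which most accesses fall on already
open rows approaches 20ns per access, against 70ns when every access selects a
new row.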

Referring to Figure 3, there is shown an exemplary circuit diagram for the 
word line enable logic. The output of the exemplary circuit shown in Figure 3 generates 
the word line enable signal WL_EN that activates a pump circuit that boosts the voltage
level on the selected word line. To further reduce the circuit area, two adjacent memory
arrays may share the word line boost circuit, since both cannot be active simultaneously.
The shared word line boost circuit is the subject of a commonly-assigned, related
pending United States Patent Application Number 08/______ (Atty Docket No. 00939A-
0331), which is hereby incorporated in its entirety for all purposes.

Assuming a given word line enable signal WL_EN drives a word line
boost circuit that is shared by arrays Ai and Ai+1, the word line enable logic must
implement the following functions:

(1) At time t0, WL_EN is turned OFF when either one of arrays Ai or Ai+1 is activated.

(2) At time t20, WL_EN is turned ON when either one of arrays Ai or Ai+1 is activated.

(3) At time t0, WL_EN is turned OFF when array Ai-1 is activated while array Ai was
active.

(4) At time t0, WL_EN is turned OFF when array Ai+2 is activated while array Ai+1 was
active.

As described above, when a new row in an array is selected, according to
the present invention, any open rows in that array are first turned off at time t0 to allow
for precharging. Condition (1) is implemented by transistors 308 or 314. When array Ai
or Ai+1 is activated, signals t0-Ai or t0-Ai+1 are respectively asserted at time t0 and
unasserted before time t20. When signal t0-Ai goes high, transistor 308 is turned on,
pulling node 316 down to ground, overpowering latch 317. Signal WL_EN is turned
low, turning off the previously selected word line. Similarly, when signal t0-Ai+1 goes
high, transistor 314 is turned on, pulling node 316 down to ground, and causing WL_EN
to go low.

Condition (2) refers to the turning on of the new word line in the array at
time t20, upon completion of the precharge cycle, to access the selected row. This is
accomplished by NOR gate 300 and PMOS transistor 302. When a logic high is applied
to either one of the inputs t20-Ai or t20-Ai+1, transistor 302 is turned on, pulling node 316
up to Vcc, again overpowering latch 317. This causes WL_EN to go high, activating the
newly selected word line.



The other two conditions refer to when a new array (Ai-1 or Ai+2) is
selected at time t0 adjacent to an already selected array (Ai or Ai+1). In either case, a
pair of transistors 304/306 or 310/312 are turned on, pulling node 316 down to ground,
and causing WL_EN to turn off.
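
The four conditions can also be summarized as a small behavioral model. The
sketch below is offered only as an aid to understanding and is an assumption
throughout: the class and method names are hypothetical, the latched state
stands in for latch 317, and no transistor-level behavior is modeled. The
first method corresponds to the start of a command, when word lines are turned
off, and the second to the later time at which the new word line is turned on
after precharge.

```python
from typing import Optional


class WordLineEnable:
    """Hypothetical model of WL_EN for a boost circuit shared by arrays i and i+1."""

    def __init__(self, i: int) -> None:
        self.i = i
        self.wl_en = False                       # latched enable state
        self.active_array: Optional[int] = None  # which of i, i+1 has an open row

    def at_t0(self, target: int) -> None:
        """Start of a command activating array `target`."""
        i = self.i
        own = target in (i, i + 1)                              # condition (1)
        left = target == i - 1 and self.active_array == i       # condition (3)
        right = target == i + 2 and self.active_array == i + 1  # condition (4)
        if own or left or right:
            self.wl_en = False
            self.active_array = None

    def after_precharge(self, target: int) -> None:
        """End of precharge for the same command."""
        if target in (self.i, self.i + 1):                      # condition (2)
            self.wl_en = True
            self.active_array = target


w = WordLineEnable(4)
w.at_t0(4)
w.after_precharge(4)
print(w.wl_en)  # array 4 now has an open row
```

Note that activating array i-1 leaves WL_EN untouched when the open row is in
array i+1, since those two arrays do not share sense amplifiers.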

To maximize the bandwidth, the preferred embodiment of the present 
invention maximizes the likelihood of consecutive accesses to already open rows. This 
can be accomplished by carefully choosing where in the arrays pixel data is stored.
Referring to Figure 4, there is shown a simplified video screen 400 of, for example, 
1024x512 size. The video controller processes pixel data in two modes. When 
displaying the pixels, the screen is scanned horizontally starting from the top line L(0) to 
the bottom line L(511) of the screen. At other times, the controller may process, for
example, a 32x32 tile of pixels.

One example of distributing pixel data to take advantage of the open arrays 
in the memory circuit of the present invention divides the screen into a top half and a 
bottom half. Pixel data corresponding to the top half of the screen are stored in even
numbered memory arrays, and pixel data corresponding to the bottom half of the screen 
are stored in the odd numbered memory arrays. If each pixel is represented by 32 bits 
of data, then a 1024-bit row in an array can store data corresponding to 32 pixels. 
Accordingly, the first group of 32 pixels in line L(0) are stored in row 0 of array 0, the
second group of 32 pixels in line L(0) are stored in row 0 of array 2, the third group of 
32 pixels in line L(0) are stored in row 0 of array 4, etc. With this type of distribution, 
all the data required to display line L(0) on the screen 400 can be simultaneously 
available in already open rows in even numbered arrays. 

A similar distribution technique is preferably employed for storing each 
32x32 tile of pixels. That is, the first row of the first tile is stored in Row 0 of Array 0
as discussed. The second row of the first tile is stored in Row 1 of Array 2, etc. This 
distribution is partially shown in Figure 4. With each row of a given tile in different 
arrays which can be simultaneously open, all the data for a given tile can be in open 
rows. When data is manipulated in tiles, performance significantly improves by fast 
access to the full contents of any tile. Thus, data is transferred at a significantly faster
rate as long as consecutive accesses are made to the same set of open rows. Power
dissipation is also reduced by reducing the number of times arrays are required to be 
turned off and on. It is to be understood that other common screen sizes such as 1024 x
768 or 1280 x 1024 as well as other numbers of bits per pixel such as 8, 16, or 24 can
also be arranged with an appropriate memory size to open either a full row of screen 
data or a full tile. 
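
One possible address mapping consistent with the examples given above is
sketched below for the 1024x512 screen at 32 bits per pixel. It is an
assumption, not a mapping stated in this description: the text fixes only the
placement of line L(0) and of the first two rows of the first tile, and the
sketch extends these by rotating the array index by one position per screen
line. The function name locate and all constant names are hypothetical.

```python
BITS_PER_PIXEL = 32
ROW_BITS = 1024
PIXELS_PER_ROW = ROW_BITS // BITS_PER_PIXEL   # 32 pixels fit in one memory row
SCREEN_W, SCREEN_H = 1024, 512
GROUPS_PER_LINE = SCREEN_W // PIXELS_PER_ROW  # 32 groups of pixels per scan line
HALF_H = SCREEN_H // 2                        # 256 lines per screen half


def locate(x: int, y: int):
    """Return (array, row, pixel offset within the row) for screen pixel (x, y)."""
    group = x // PIXELS_PER_ROW
    parity = 0 if y < HALF_H else 1  # top half: even arrays, bottom half: odd
    row = y % HALF_H
    # Assumed rotation: each successive line starts one array position later,
    # so both a scan line and a 32x32 tile fall in 32 distinct arrays.
    slot = (group + row) % GROUPS_PER_LINE
    return 2 * slot + parity, row, x % PIXELS_PER_ROW


line0 = {locate(x, 0)[0] for x in range(0, SCREEN_W, PIXELS_PER_ROW)}
tile0 = {locate(0, y)[0] for y in range(32)}
print(sorted(line0) == sorted(tile0))  # both use the same 32 even arrays
```

With this assumed rotation, every array used by a given scan line or tile has
the same parity, so no two of them are adjacent and all of their rows can be
open at the same time.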

In conclusion, the present invention provides a memory circuit that is 
particularly suited for video applications. The memory circuit of the present invention 
achieves much higher bandwidth and reduced power consumption by maintaining the 
maximum number of memory arrays open simultaneously. Circuit area is also saved by 
sharing bit line sense amplifiers between adjacent arrays. A specific video memory 
circuit which incorporates an exemplary embodiment of the present invention as well as 
other related circuit techniques is described in greater detail in the article entitled "An 
Embedded Frame Buffer for Graphics Applications," attached herein as Appendix A.

While the above is a complete description of specific embodiments of the 
present invention, various modifications, variations and alternatives may be employed. 
The scope of this invention, therefore, should not be limited to the embodiments 
described, and should instead be defined by the following claims. 



